t: add lint-style.pl and convert grep to test_grep#2135
Open
mmontalbo wants to merge 6 commits into
Open
Conversation
3258ee2 to
cfa4313
Compare
test_grep is a wrapper around grep for test assertions that prints the file contents on failure for easier debugging. It also accepts '!' as its first argument for negation, which preserves the diagnostic output that '! test_grep' would suppress. Despite being widely used (and the preferred replacement for bare grep in assertions), test_grep has no entry in t/README alongside the other documented helpers like test_cmp and test_line_count. Add one. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
1f8f6e1 to
5d5366a
Compare
a878d13 to
92fdf29
Compare
Move the Lexer, ShellParser, and ScriptParser packages from chainlint.pl into t/lib-shell-parser.pl so they can be reused by other tools. ScriptParser's check_test() is a no-op in the shared module; callers subclass ScriptParser and override it. chainlint.pl defines TestParser (&&-chain detection) and ChainlintParser (a ScriptParser subclass whose check_test runs TestParser and formats the results). The shared module is loaded via do() for portability with minimal Perl installations. A subsequent commit introduces lint-style.pl which needs the same shell parser to properly tokenize test scripts. Sharing the parser avoids reimplementing heredoc handling, $(...) nesting, pipe tracking, quoting, and test body extraction. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
scan_dqstring's post-loop newline counter re-counts newlines that were already counted during recursive parsing of $() bodies. This happens because scan_dollar's returned text can contain newlines (from token text of multi-line strings and from \n command separator tokens), and the catch-all counter at the end of scan_dqstring counts all of them again. Fix this by counting newlines inline as non-special characters are consumed, and removing the post-loop catch-all. Each newline is now counted exactly once: literal newlines at the inline match, line splices at the \<newline> handler, and $() newlines by scan_token during the recursive parse. This does not affect chainlint's output because chainlint annotates the original body text using byte offsets, not token line numbers. It does matter for tools like lint-style.pl (introduced in a subsequent commit) that use token line numbers to locate and fix specific lines in the original file. Add check-shell-parser.pl to verify that the Lexer reports correct line numbers after multi-line $() in double-quoted strings. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Add a mechanical lint checker for test scripts, similar in spirit to check-non-portable-shell.pl but focused on test conventions rather than portability. The tool defines LintParser, a subclass of ScriptParser (from the shared lib-shell-parser.pl module). ScriptParser's parse_cmd() finds test_expect_success blocks and calls check_test() for each body; LintParser overrides check_test() to run lint rules on the parsed commands. A "# lint-ok" comment suppresses all checks for intentional style violations. The first rule detects '! test_grep' and replaces it with 'test_grep !'. Shell-level negation suppresses the diagnostic output that test_grep prints on failure; the built-in negation preserves it. Three violations inside test bodies are converted via --fix. One additional violation in a helper function outside test_expect_success (t7900's test_geometric_repack_needed) is converted manually, since the parser only processes test bodies. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Three grep assertions were missing their file arguments, causing them to read from empty stdin instead of the intended file: - t2402: '! grep ...' should read from 'out', matching the grep on the preceding line. - t7507: the closing quote is in the wrong place, making the entire 'diff --git actual' a single pattern with no file argument instead of pattern 'diff --git' and file 'actual'. - t7700: '! grep ...' should read from 'packlist', matching the redirect on the preceding line. Without file arguments these greps always succeed (empty stdin matches nothing), so the assertions were not actually checking anything. All three tests pass with the corrected file arguments, confirming the intended behavior is sound. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
92fdf29 to
e509df3
Compare
Extend lint-style.pl with a rule that detects bare 'grep' used as a
test assertion and converts it to test_grep. test_grep prints the
file contents on failure, making test debugging significantly easier.
parse_commands() is extended to split at shell structural tokens
({, }, (, ), |) and keywords (if, then, for, etc.), and each
command gains a token_pos index so that rules can scan backward and
forward in the token stream for context.
Three new functions implement the grep-assertion rule:
- is_filter_context() scans the surrounding tokens for pipes,
control-flow keywords (if/elif/while/until), for-in value
lists, and brace groups with output redirects.
- is_grep_assertion() classifies a grep command: convertible
assertion (pattern and file present), filter (not an assertion),
or missing file argument (flagged as a likely bug).
- check_bare_grep() ties them together and calls
report_violation() with the appropriate fix.
The --fix mode handles:
- Replacing 'grep' with 'test_grep'
- Moving negation from '! grep' to 'test_grep !'
- Stripping the -q flag (test_grep inherently checks match status)
Five files require '# lint-ok' annotations for intentional grep
usage that cannot be mechanically converted: t1400 (packed-refs
may not exist on reftable), t3901 (piped stdin via case block),
t6437 (glob argument breaks test_grep's test -f check), t7450
(file may not exist after failed MINGW clone), and t7527 ($?
capture on the next line).
The test-lint-style scope is extended to include sourced test
fragments in subdirectories (t5411/*.sh and similar) via a new
TSOURCED variable.
Run '--fix' to convert all ~2800 grep assertions across ~340 files
in the test suite. test-lib-functions.sh and lib-rebase.sh are
excluded from linting since they implement test infrastructure
rather than test assertions.
Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
e509df3 to
1527293
Compare
Author
|
/preview |
|
Preview email sent as pull.2135.git.1780547320.gitgitgadget@gmail.com |
Author
|
/submit |
|
Submitted as pull.2135.git.1780559158.gitgitgadget@gmail.com To fetch this version into To fetch this version to local tag |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The test suite has a test_grep wrapper that prints file contents on
assertion failure, making debugging easier. Many tests still use
bare 'grep' for assertions, which silently swallows context on
failure.
This series adds a lint tool (lint-style.pl) to mechanically detect
and convert these, then applies it across the test suite.
The tool reuses the shared shell parser (t/lib-shell-parser.pl)
rather than reimplementing shell parsing. This gives us proper
handling of heredocs, $(...), pipes, quoting, case patterns, and
control flow structures for free. The only new parsing logic is
the grep assertion classifier, which works on the already-parsed
token stream.
The classifier distinguishes grep used as an assertion (checking
exit status against PATTERN + FILE) from grep used as a filter
(pipes, redirects, $(), control flow conditions). Only assertions
are converted. Greps at assertion level with no file argument are
flagged as likely bugs; three pre-existing instances were found
and fixed in patch 5.
The --fix mode converts mechanically: 'grep' becomes 'test_grep',
'! grep' becomes 'test_grep !'.
Structure:
1/6 t/README: document test_grep helper
2/6 t: extract chainlint's parser into shared module
3/6 t: fix Lexer line count for $() inside double-quoted strings
4/6 t: add lint-style.pl with test_grep negation rule
5/6 t: fix grep assertions missing file arguments
6/6 t: lint and convert grep assertions to test_grep
Patch 2 extracts chainlint.pl's Lexer, ShellParser, and
ScriptParser into t/lib-shell-parser.pl. ScriptParser's
check_test() becomes a no-op in the module; chainlint.pl defines
ChainlintParser (a ScriptParser subclass that runs TestParser for
&&-chain detection), and lint-style.pl defines LintParser (a
ScriptParser subclass that runs grep lint rules).
Patch 3 fixes a pre-existing bug in scan_dqstring where the
post-loop newline counter re-counted newlines that were already
counted during recursive $() parsing. The fix counts newlines
inline as non-special characters are consumed, removing the
catch-all counter entirely. chainlint is unaffected (it uses
byte offsets), but lint-style.pl needs accurate token line
numbers to locate and fix specific lines in the original file.
Patch 4 introduces the lint-style.pl framework (LintParser
subclass, Makefile targets, fixture infrastructure) with a small,
complete rule (! test_grep -> test_grep !) so reviewers can see
the machinery in action before the bulk conversion.
Patch 5 fixes three test bugs where grep assertions were missing
their file arguments, causing them to pass vacuously (all three
pass with the corrected arguments). These were independently
discovered by the missing-file detection rule introduced in
patch 6.
Patch 6 adds the main rule, its fixtures, and the mechanical
conversion of ~2800 assertions across ~340 files, including sourced
test fragments in t/t5411/ and heredoc test bodies in t/t5564/.
To verify the conversion (patch 6 adds both the rule and the
mechanical conversion in the same commit, so apply the full
series and re-run --fix to confirm it produces no further
changes):
git checkout &&
perl t/lint-style.pl --fix t/t*.sh t/test-lib*.sh t/lib-.sh
t/-tests.sh t/perf/.sh t/t5411/.sh
As an independent completeness check:
grep -rn '^\s*!\sgrep\s' t/t*.sh t/lib-.sh t/-tests.sh
t/test-lib*.sh t/perf/*.sh |
grep -v test_grep | grep -v lint-ok | grep -v '#.*grep'
Future rules:
The framework is designed to make adding new rules cheap. Each
rule is a function that receives parsed commands and the token
stream. The harness handles tokenization, line mapping, --fix,
and fixture testing. Three natural follow-ups:
'test_must_fail grep': test_must_fail distinguishes expected
failures from crashes (signals), which only makes sense for
git commands. Using it with grep or test_grep should be
'! grep' or 'test_grep !' instead.
'! git cmd': should use test_must_fail, which
distinguishes controlled failures from crashes. The README
explicitly documents this as a "don't".
'test -f' / 'test -d' / 'test -e': should use
test_path_is_file, test_path_is_dir, test_path_exists.
The helpers print the actual directory listing on failure;
bare 'test' just says "failed".
Known limitations:
One grep in t/t1400 asserts against .git/packed-refs which
does not exist on the reftable backend; suppressed with
'# lint-ok'.
One grep in t/t7450 checks a path inside a clone that may
have failed (MINGW-only test); the file may not exist, so
test_grep's existence check would trip; suppressed with
'# lint-ok'.
Two greps in t/t3901 inside case branches that inherit piped
stdin from two lines above are suppressed with '# lint-ok'.
One grep in t/t6437 uses glob expansion (grep -q content *)
which breaks test_grep's file check; suppressed with '# lint-ok'.
One grep in t/t7527 captures $? for later use rather than
asserting inline; suppressed with '# lint-ok'.
cc: "D. Ben Knoble" ben.knoble@gmail.com, Eric Sunshine sunshine@sunshineco.com